Fine-Grained Parallel Incomplete LU Factorization

Authors

  • Edmond Chow
  • Aftab Patel
Abstract

This paper presents a new fine-grained parallel algorithm for computing an incomplete LU factorization. All nonzeros in the incomplete factors can be computed in parallel and asynchronously, using one or more sweeps that iteratively improve the accuracy of the factorization. Unlike existing parallel algorithms, the new algorithm does not depend on reordering the matrix. Numerical tests show that very few sweeps are needed to construct a factorization that is an effective preconditioner.
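The sweep idea in the abstract can be illustrated with a small sketch. The abstract does not spell out the update formulas, so the code below assumes the commonly described fixed-point formulation for an ILU(0) pattern: each nonzero position (i, j) of A gets its own equation, and every equation is re-evaluated from the previous iterate. The function name `fixed_point_ilu0`, the dense storage, and the initial guess taken from the triangular parts of A are illustrative assumptions, not the authors' implementation. The updates within a sweep are independent of each other, which is what would allow them to run in parallel; here they run in a plain serial loop.

```python
import numpy as np

def fixed_point_ilu0(A, sweeps=3):
    """Hypothetical sketch: approximate ILU(0) factors by fixed-point sweeps.

    Assumes A is square with a nonzero diagonal. Dense arrays are used
    only to keep the example short.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # Sparsity pattern S: positions where A is nonzero (no fill-in allowed).
    pattern = [(i, j) for i in range(n) for j in range(n) if A[i, j] != 0.0]
    # Assumed initial guess: strict lower part of A (unit diagonal) and upper part of A.
    L = np.tril(A, -1) + np.eye(n)
    U = np.triu(A)
    for _ in range(sweeps):
        # Read only the previous iterate, so updates in a sweep are independent.
        L_old, U_old = L.copy(), U.copy()
        for i, j in pattern:
            m = min(i, j)
            s = L_old[i, :m] @ U_old[:m, j]  # sum over k < min(i, j)
            if i > j:
                L[i, j] = (A[i, j] - s) / U_old[j, j]
            else:
                U[i, j] = A[i, j] - s
    return L, U
```

For a tridiagonal matrix, exact LU factorization produces no fill-in, so enough sweeps of this sketch should reproduce A from the product of the two factors. With fewer sweeps the factors are only approximate, which matches the abstract's point that a few sweeps may already give a usable preconditioner.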


Related works

Fault tolerant variants of the fine-grained parallel incomplete LU factorization

This paper presents an investigation into fault tolerance for the fine-grained parallel algorithm for computing an incomplete LU factorization. Results concerning the convergence of the algorithm with respect to the occurrence of faults, and the impact of any sub-optimality in the produced incomplete factors in Krylov subspace solvers are given. Numerical tests show that the simple algorithmic ...


Exploiting Fine-Grain Parallelism in Recursive LU Factorization

The LU factorization is an important numerical algorithm for solving systems of linear equations in science and engineering and is characteristic of many dense linear algebra computations. It has even become the de facto numerical algorithm implemented within the LINPACK benchmark to rank the most powerful supercomputers in the world, collected in the TOP500 website. In this context, the challen...


Optimization of A Fine-grained BILU by CUDA Inter-block Synchronization

A fine-grained block incomplete LU (FGBILU) factorization for solving large-scale block-sparse linear systems resulting from coupled PDE systems with n equations has been recently developed for massively parallel heterogeneous architectures, such as general-purpose graphics processing units (GPGPUs). A straightforward one-sweep wavefront ordering is combined with element-wise block submatrix ope...


Level-based Incomplete LU Factorization: Graph Model and Algorithms

A graph theoretic process that models level-based, incomplete LU factorization (ILU(ℓ)) of sparse unsymmetric matrices is developed. The model leads to two incomplete fill path theorems that are generalizations of the original fill path theorem of Rose, Tarjan, and Lueker. Our S-level incomplete fill path theorem leads to the development of new, embarrassingly parallel algorithms for computing ...


Computing a block incomplete LU preconditioner as the by-product of block left-looking A-biconjugation process

In this paper, we present a block version of incomplete LU preconditioner which is computed as the by-product of block A-biconjugation process. The pivot entries of this block preconditioner are one by one or two by two blocks. The L and U factors of this block preconditioner are computed separately. The block pivot selection of this preconditioner is inherited from one of the block versions of...



Journal:
  • SIAM J. Scientific Computing

Volume 37

Publication date: 2015